A Unified View of Spectral Clustering∗

Authors

  • Desmond J. Higham
  • Milla Kibble
Abstract

We formulate a discrete optimization problem that leads to a simple and informative derivation of a widely used class of spectral clustering algorithms. Regarding the algorithms as attempting to bi-partition a weighted graph with N vertices, our derivation indicates that they are inherently tuned to tolerate all partitions into two non-empty sets, independently of the cardinality of the two sets. This approach also helps to explain the difference in behavior observed between methods based on the unnormalized and normalized graph Laplacian. We also give a direct explanation of why Laplacian eigenvectors beyond the Fiedler vector may contain fine-detail information of relevance to clustering. Another advantage of our discrete formulation is that it admits a random graph interpretation, showing that spectral clustering may be viewed as maximum likelihood partitioning under the assumption that the data is an instance of a graph with random edge weights. The resulting distribution on the weights formalizes and quantifies the intuitive notion that vertices in the same cluster are more likely to have high weights than vertices in different clusters. Numerical experiments that illustrate the analysis are included.

Keywords: balancing threshold, Rayleigh-Ritz Theorem, Fiedler vector, graph Laplacian, random graph, maximum likelihood, partitioning.

AMS Subject Classification: 65F15, 90C27, 05C85

∗ This manuscript appears as University of Strathclyde Mathematics Research Report 02 (2004).

Department of Mathematics, University of Strathclyde, Glasgow G1 1XH, UK. Supported by Research Fellowships from The Leverhulme Trust and The Royal Society of Edinburgh/Scottish Executive Education and Lifelong Learning Department.

Department of Mathematics, University of Turku, FIN-20014 Turku, Finland. Supported by the Academy of Finland, under grant number 53441.
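The bi-partitioning idea in the abstract can be illustrated with a minimal sketch, which is not taken from the paper itself. It builds the unnormalized graph Laplacian of a small weighted graph, approximates the Fiedler vector, and splits the vertices by sign. The function names (`laplacian`, `fiedler_vector`, `bipartition`), the toy weight matrix `W`, and the choice of zero as the splitting threshold are all assumptions made for illustration. To stay dependency-free, a shifted power iteration stands in for a library eigensolver; the sketch assumes a connected graph with at least two vertices.

```python
import math


def laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def fiedler_vector(W, iters=500):
    """Approximate the eigenvector of the second-smallest Laplacian eigenvalue.

    Runs power iteration on (shift * I - L) while projecting out the constant
    vector, which is the eigenvector belonging to eigenvalue 0.
    """
    n = len(W)
    L = laplacian(W)
    # Gershgorin bound: every eigenvalue of L is at most twice the largest degree,
    # so (shift * I - L) has only non-negative eigenvalues.
    shift = 2.0 * max(L[i][i] for i in range(n))
    # Deterministic, non-constant start vector with zero mean.
    x = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [value - mean for value in y]  # stay orthogonal to the constant vector
        norm = math.sqrt(sum(value * value for value in y))
        x = [value / norm for value in y]
    return x


def bipartition(W):
    """Assign each vertex to a side according to the sign of its Fiedler entry."""
    return [1 if value > 0 else 0 for value in fiedler_vector(W)]


# Hypothetical toy graph: two triangles (weight 1.0) joined by one weak edge (0.1).
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
labels = bipartition(W)
print(labels)
```

On this toy graph the two triangles should land on opposite sides of the split, since the weak edge is the cheapest place to cut. A variant based on the normalized Laplacian would rescale by the vertex degrees before the eigenvector step; it is omitted here to keep the sketch short.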

Similar resources

A Least-Squares Unified View of PCA, LDA, CCA and Spectral Graph Methods

Over the last century Component Analysis (CA) methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA) and Spectral Clustering (SC) have been extensively used as a feature extraction step for modeling, classification, visualization, and clustering. This paper proposes a unified framework to formulate PCA, LDA, CCA, and SC as a ...

From Ensemble Clustering to Multi-View Clustering

Multi-View Clustering (MVC) aims to find the cluster structure shared by multiple views of a particular dataset. Existing MVC methods mainly integrate the raw data from different views, while ignoring the high-level information. Thus, their performance may degrade due to the conflict between heterogeneous features and the noises existing in each individual view. To overcome this problem, we pro...

A Unified Framework for Discrete Spectral Clustering

Spectral clustering has been playing a vital role in various research areas. Most traditional spectral clustering algorithms comprise two independent stages (i.e., first learning continuous labels and then rounding the learned labels into discrete ones), which may lead to severe information loss and performance degradation. In this work, we study how to achieve discrete clustering as well as re...

A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts

Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning ob...

Imagerank: spectral techniques for structural analysis of image database

Drawing on the correspondence between spectral clustering, spectral dimensionality reduction, and the connections to the Markov Chain theory, we present a novel unified framework for structural analysis of image database using spectral techniques. The framework provides a computationally efficient approach to both clustering and dimensionality reduction, or 2-D visualization. Within this framewo...

Guided Co-training for Large-Scale Multi-View Spectral Clustering

In many real-world applications, we have access to multiple views of the data, each of which characterizes the data from a distinct aspect. Several previous algorithms have demonstrated that one can achieve better clustering accuracy by integrating information from all views appropriately than using only an individual view. Owing to the effectiveness of spectral clustering, many multi-view clus...

Publication date: 2004